// Copyright 2022 Douglas P. Fields, Jr. All Rights Reserved.

`ifdef IS_QUARTUS // Defined in Assignments -> Settings -> ... -> Verilog HDL Input
// This doesn't work in Questa for some reason. vlog-2892 errors.
`default_nettype none // Disable implicit creation of undeclared nets
`endif

/*
Copyright 2022 Douglas P. Fields, Jr.
 
SystemVerilog of an I2C Controller.

See: UM10204 from NXP, Rev 7.0 - 1 October 2021
I2C-bus specification and user manual

This implements a very simple I2C controller
that uses a 4x speed input clock to communicate over
the I2C bus. This first implementation is a single
controller implementation so as not to have to deal with
arbitration or other multi-controller issues, and
will only implement the mandatory features for now.
That means it will not deal with clock stretching.

This initial design is intended only for the 100 kbit/s
or 400 kbit/s speeds. These are called standard and fast mode.

The two bus lines, SCL and SDA (clock & data) are bidirectional
and are tied to high via a pull-up resistor (around 5KOhms,
but there is a formula for it) or current source. When the bus
is free, both lines are high.

----------- Start/Stop

START and STOP are encoded by having SDA change state while
SCL is high. 
START "S" has SDA transition high to low with SCL high.
STOP "P" has SDA transition low to high with SCL high.
These are only generated by the controller.
The bus is busy after a START, and free again a short
time after STOP.
A repeated START "Sr" keeps the bus busy, in lieu of a STOP.

START immediately followed by a STOP is illegal.

------------ Data

Data on SDA must be stable when SCL is high.
SDA data can only change when SCL is low.

The data consists of bytes of 8 bits, transferred MSB first,
followed by an Acknowledge bit.

After every byte, an ACK (or NACK) bit occurs.
The controller generates the SCL clock pulse for this bit,
releasing SDA, so the target holds SDA low for as long
as SCL is high to indicate an ACK. A high SDA during the
acknowledge SCL high means a NACK.

After a NACK, the controller can abort with a STOP
or start a new transfer with START.

If the target cannot accept a byte and needs a delay, it can
hold SCL low to force a wait state. When it's ready to continue
it releases SCL (which causes it to go high due to pull-up).

----------- Clock Stretching

Not implemented for now (see Spec 3.1.9).

----------- Address

There are 7 bits of an address; the 8th bit is a Read/Write bit.
The address is sent right after a start.
A logical zero is a WRITE, and a one is a READ.

For writes, the target acknowledges each byte.

For reads, the target acknowledges the first (address) byte.
The target then becomes a transmitter and the controller
acknowledges each byte.
The controller sends a NACK just before the STOP.

If the controller-receiver is going to do a repeated START,
it should send a NACK just before the repeated START.

Addresses whose top four bits are 0000 or 1111 are reserved and
not implemented by this system.
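
For example (an illustrative sketch; first_byte is a hypothetical name
and 7'h50 is just a sample address, not one this design assumes), the
first byte on the bus packs the 7-bit address with the R/W bit as the
LSB, the same way S_IDLE builds k_address below:

    first_byte = { address[6:0], read };
    // address = 7'h50, read = 0 (WRITE) -> 8'hA0
    // address = 7'h50, read = 1 (READ)  -> 8'hA1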

----------- Internal clock notes

Version 1: Will have its own 4x speed clock.

Version 2: Can use the system clock with a specified
divider to make a synthetic 4x speed clock.

We use a 4x speed to make every bit 4 parts:
  SCL: low HIGH HIGH low
  
START   SCL: low  HIGH HIGH low
        SDA: HIGH HIGH low  low

STOP    SCL: low  HIGH HIGH low
        SDA: low  low  HIGH HIGH
      
(really, the first and last SDA don't really matter)

For a data bit:

DATA    SCL: low  HIGH HIGH low
        SDA: HIGH -------------
         or: low  -------------
      
For an ACK bit:
N/ACK   SCL: low  HIGH HIGH low
        SDA: Z    R    R    Z
We should probably read in the 3rd phase. Maybe 4th.

I'll add another power on state, where we claim to be
busy for N cycles

So, for a 3 byte write transmission, the states are:

POWERON ->
IDLE ->
START ->
WRITE 1 -> ACK 1 ->
WRITE 2 -> ACK 2 ->
WRITE 3 -> ACK 3 ->
STOP ->
IDLE

We will need a register to save the last ACK and if it's NACK
we will go directly to STOP.

-------------- Power on reset
-------------- Bus clear (SCL or SDA stuck low)
 
-------------- Frequencies

Minimum 10kHz per section 4.2.2.
We are implementing the 100/400kHz speed spec
(Standard, Fast modes).
We could try this at Fast-mode Plus (1MHz) if we ever wanted.

This has been tested at 100kHz and 400kHz using 4x clock
as an input. Now, we're going to use a system clock and
just do our own internal delay cycles.

50 MHz clock -> 1.6 MHz clock for 400kHz
  = divide by 31.25 - round up to 32, don't want to exceed 400kHz
  = 1.5625 MHz -> ~391 kHz I2C clock
50 MHz clock -> 400 kHz clock for 100kHz
  = divide by 125 exactly
  
What we'll do is have a counter, and whenever that counter
is non-zero, we'll do nothing (except increment the counter).
When the counter is zero, we'll do a regular cycle.
This avoids having two different clock domains to deal with.
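
As a sketch of the arithmetic above (SYS_HZ and I2C_HZ are hypothetical
names, not parameters of this module), CLK_DIV can be computed with a
ceiling division so the I2C clock never exceeds the target rate:

    localparam int SYS_HZ  = 50_000_000;
    localparam int I2C_HZ  = 400_000;
    // 4 steps per I2C bit; round the divider up
    localparam int CLK_DIV = (SYS_HZ + 4*I2C_HZ - 1) / (4*I2C_HZ);
    // I2C_HZ = 400_000 -> CLK_DIV = 32  (SCL ~ 390.6 kHz)
    // I2C_HZ = 100_000 -> CLK_DIV = 125 (SCL = 100 kHz)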
 
*/

// Simulation: Tick at 1ns (using #<num>) to a precision of 0.1ns (100ps)
`timescale 1 ns / 100 ps

 
module I2C_CONTROLLER #(
  // Every this many clocks we should do something, at 4x I2C clock speed
  // Gives just under 1.6 MHz from 50 MHz, for Fast I2C ~400 kHz
  parameter CLK_DIV = 32,
  // Number of bits to count as high as the above
  parameter CLK_CNT_SZ = $clog2(CLK_DIV)
) (
  // Our input system clock which will be internally
  // divided by CLK_DIV
  input  logic clk,
  input  logic reset,

  // I2C Bus
  // These need to be connected to a tristate buffered output pin
  input  logic scl_i, // I2C SCL input
  output logic scl_o, // I2C SCL output
  output logic scl_e, // I2C SCL output enabled
  
  input  logic sda_i, // I2C SDA input
  output logic sda_o, // I2C SDA output
  output logic sda_e, // I2C SDA output enabled
  
  // Outputs about what's going on
  output logic busy,    // We're busy communicating
  output logic abort,   // When !busy: The last transaction was aborted
  output logic success, // When !busy: The last transaction was successful
  // TODO: Assert that when ~busy, abort ^ success except after a reset
  
  // Inputs asking for something to do:
  // We only support a 3 byte transaction for now.
  input  logic       activate,  // True to begin when !busy
  input  logic       read,      // True to do a read operation, false for write
  input  logic [6:0] address,
  input  logic [7:0] location,  // Ignored on read
  input  logic [7:0] data,      // Ignored on read
  input  logic       read_two,  // True if we should read two bytes or false for just one
  
  // How many times we should repeat data
  input  logic [2:0] data_repeat,
  
  // If it's a read, here are the two data bytes we ("red") read
  // This should probably be considered valid only if `success`
  output logic [7:0] data1,
  output logic [7:0] data2,
  
  // Monitoring outputs
  output logic start_pulse,
  output logic stop_pulse,
  
  // Debugging outputs
  output logic got_ack
);

// Roland suggests that using "initial" is bad, and is only for simulation,
// and it's good to know if you have any X's because "initial" may not be
// synthesizable or reliable. He doesn't use it.
// (Similarly, "logic xxx = something" is bad.)

// Our main states in our I2C state machine
localparam S_POWERON = 3'd0,
           S_RESET   = 3'd1,
           S_IDLE    = 3'd2,
           S_START   = 3'd3,
           S_ADDRESS = 3'd4,
           S_DATA1   = 3'd5,
           S_DATA2   = 3'd6,
           S_STOP    = 3'd7;
          
// Clock divider counter
localparam CLK_DIV_CNT_ONE = { {(CLK_CNT_SZ-1){1'b0}}, 1'b1 }; // 1 in the proper bit width
localparam CLK_CNT_MAX = CLK_DIV - 1;
logic [CLK_CNT_SZ:0] clk_div_cnt;

// Our state machine       
logic [2:0] state = S_RESET; // This is bad, but I'm doing it anyway
logic [1:0] step;
logic [3:0] byte_step; // 0-7 = get/send a bit (MSB first); 8 = send/get ACK
logic [2:0] byte_idx;  // Counts down from 7 to 0 for sending/getting a byte MSB first

// Combinatoric signals
logic is_write; // True when we drive SDA for this byte: always for the address byte, and for data bytes unless reading
logic next_got_ack; // Any previously seen ACK or the current ~SDA

// Power up state - we wait a short time before processing when powering
// on or resetting, just in case.
logic [3:0] poweron_counter;

// Storage of what we've been asked to send
logic [7:0] k_address; // Kept address AND READ FLAG as LSB *** !!! - like k_read
logic [7:0] k_location, k_data; // Kept data and location
logic [2:0] k_data_repeat; // How many times to send the data byte again?
logic       k_read_two;

// Collect READ data
// out_data1/out_data2 are exposed externally as data1/data2 (see always_comb below);
// read_two selects whether one or two bytes are read.
logic [7:0] data_accum;
logic [7:0] out_data1;
logic [7:0] out_data2;


always_comb begin
  // We are writing data IF we are writing the address byte
  // OR it's a data field AND it isn't reading.
  is_write = state == S_ADDRESS || ~k_address[0];

  // If we're writing, we have to read the ACK,
  // which is a LOW SDA signal (or a previously noticed low).
  next_got_ack = got_ack || ~sda_i;

  // Drive the read-data outputs from the internal capture registers;
  // without this the data1/data2 ports would be left undriven.
  data1 = out_data1;
  data2 = out_data2;
end


always_ff @(posedge clk) begin

  // If we are being reset
  if (reset) begin
    state <= S_RESET;
    // Since this can be asserted for a bunch of clock cycles,
    // update our important external signals
    busy <= 1;
    abort <= 0;
    success <= 0;
    start_pulse <= 0;
    stop_pulse <= 0;
    got_ack <= 0;
    // Turn off our outputs
    sda_e <= 0;
    scl_e <= 0;
    clk_div_cnt <= CLK_DIV_CNT_ONE;
    data_accum <= 8'b0;
    out_data1 <= 8'b0;
    out_data2 <= 8'b0;

    // TODO: If we get a reset when not idle, we should send a stop/abort first,
    // maybe even finish what we're doing first at least to the end of the byte,
    // and know that we're handling a reset as a state so don't do anything different.
    // START followed by STOP is not allowed, per spec (UM10204 page 14 version 7.0).

  end else if (clk_div_cnt != 0) begin ////////////////////////////////////
  
    // Do nothing until we get to our divider
    if (clk_div_cnt == CLK_CNT_MAX)
      clk_div_cnt <= 0;
    else
      clk_div_cnt <= clk_div_cnt + CLK_DIV_CNT_ONE;
    
  end else begin //////////////////////////////////////////////////////////
  
    // Count is 0 on our clock divider now, so move it to 1.
    clk_div_cnt <= CLK_DIV_CNT_ONE;
  
    // Not resetting, not waiting for system clock to hit our divider, so:
    // do our proper 4x speed I2C state machine, finally.
    case (state)

      S_RESET: begin /////////////////////////////////////////////////
        busy <= 1; // We're powering up
        // We have no previous status
        abort <= 0;
        success <= 0;
        // Turn off our outputs
        sda_e <= 0;
        scl_e <= 0;
        // TODO: If we were doing something else, send a STOP signal.
        state <= S_POWERON;
        scl_o <= 0;
        sda_o <= 0;
        step <= 2'b0;
        byte_step <= 4'b0;
        byte_idx <= 3'h7;
        poweron_counter <= 0;
        start_pulse <= 0;
        stop_pulse <= 0;
        got_ack <= 0;
      end // S_RESET case
    
      S_POWERON: begin ////////////////////////////////////////////////////////
        // Wait a while before we start accepting requests.
        //
        if (&poweron_counter) begin
          // Reminder: unary & is a reduction AND: true only when all bits are 1
          state <= S_IDLE;
          busy <= 0;
          poweron_counter <= 0;
        end else begin
          busy <= 1;
          poweron_counter <= poweron_counter + 4'd1;
        end
      end // S_POWERON case
      
      S_IDLE: begin ////////////////////////////////////////////////////////
        if (activate) begin
          // Save data for sending, so caller can change it once busy is asserted
          k_address[7:1] <= address;
          k_address[0] <= read;
          k_location <= location;
          k_data <= data;
          k_data_repeat <= data_repeat;
          k_read_two <= read_two;
          state <= S_START;
          step <= 0;
          start_pulse <= 1; // Pulse the start just before we actually start a transaction so a scope can record it
          stop_pulse <= 0; // In case this was set
          busy <= 1;
          success <= 0;
          abort <= 0;
          // Should we do anything with the SDA/SCL here?
          // Might as well pull them low. (See I2C spec section 3.1.1 and arbitration sections?)
          // (If multi-controller, we would want to check the inputs...)
          sda_e <= 1; sda_o <= 0;
          scl_e <= 1; scl_o <= 0;
        end else begin
          busy <= 0;
          scl_e <= 0;
          sda_e <= 0;
          start_pulse <= 0;
          stop_pulse <= 0;
        end
      end // S_IDLE case
        
      S_START: begin ////////////////////////////////////////////////////////
        // SDA H->L while SCL H
      
        case (step)
          0: begin
              scl_e <= 1; scl_o <= 0; step <= 2'd1;
              sda_e <= 1; sda_o <= 1; 
              start_pulse <= 0;
            end
          1: begin
              scl_o <= 1;             step <= 2'd2;
            end
          2: begin
              sda_o <= 0;             step <= 2'd3;
            end
          3: begin
              scl_o <= 0;             step <= 2'd0;
              state <= S_ADDRESS;     byte_step <= 4'd0;
                                      byte_idx <= 3'd7;
            end
        endcase // S_START step
      end // S_START case

      S_ADDRESS, ////////////////////////////////////////////////////
      S_DATA1,   ////////////////////////////////////////////////////
      S_DATA2: begin  ////////////////////////////////////////////////////
        // HANDLE CURRENT BIT SENDING (or getting if ACK)
        
        if (byte_step == 4'd8) begin
          // HANDLE ACK
          // NB: The final byte READ by controller should be followed
          // by a NACK
          
          if (is_write) case (step) // =================================
            // When we are writing, we need to READ an ack, so we need
            // to turn off our data output and read the data input.
            // The receiver sends an ACK by pulling the data line low,
            // remaining stable LOW during the HIGH of the clock pulse.
            // (Section 3.1.6 of UM10204)
          
            0: begin
                // change data signal only on this step
                scl_e <= 1; scl_o <= 0;
                sda_e <= 0; // This should tristate the output and allow reading sda_i
                got_ack <= 0;
              end
            1: begin
                scl_o <= 1; // Start of step 2 the clock will be 1, we can read data
                // TODO: Should I check sda_i here too?
              end
            2: begin
                // Clock stays high
                // ACK is sda LOW; check
                got_ack <= next_got_ack;  // got_ack is 0, so this is just ~sda_i
              end
            3: begin
                // Might as well check again in case receiver is slow.
                // Also, set our sda output to the same as the ack so
                // we don't get a transient pulse if sda_o was previously different
                // when sda_e goes high again.
                // (usually this will be low since we will usually get an ACK?)
                // (We look to have been making a 4xI2C clock runt)
                // FIXME: This doesn't seem to work...
                got_ack <= next_got_ack;
                scl_o <= 0;
                sda_o <= ~next_got_ack;
              end

          endcase else case (step) // =================================
            // Do our is_read (!is_write) ACK
            // If this is the last byte we're reading, we NAK.
            // Otherwise we ACK if we want another byte.

            0: begin
              $display("CONTROLLER: read byte %02h @ %d", data_accum, $time);
              if (state == S_DATA1)
                out_data1 <= data_accum;
              else if (state == S_DATA2)
                out_data2 <= data_accum;
              else begin
                $display("CONTROLLER: Should never read on ADDRESS step");
                $stop;
              end

              scl_o <= 0; scl_e <= 1;
              // Send ACK or NAK after reading a byte.
              // We always NAK if we're at DATA 2, or at DATA 1 and
              // reading only one byte. Otherwise we ACK.
              if (state == S_DATA2 ||
                  state == S_DATA1 && !k_read_two) begin
                // NAK if this is DATA 2 we just read, cause we're done reading.
                // Let's send a NAK by simply not pulling down SDA
                sda_e <= 0;
                sda_o <= 1; // for fun
              end else begin
                sda_e <= 1;
                sda_o <= 0; // ACK is pulling down SDA
              end
            end

            1, 2: scl_o <= 1; // Pulse the clock
            
            3: begin
              scl_o <= 0;
              sda_e <= 0;
            end
          
          endcase // is_read case for address/data

          
        end else begin
          // HANDLE DATA BIT
          
          // Reset our "saw an ACK" flag for all non-ACK bits
          got_ack <= 0;
        
          if (is_write) case (step) // =================================
            0: begin
              // change data signal only on this step
              scl_e <= 1; scl_o <= 0;
              sda_e <= 1; 
              // What bit do we send?
              sda_o <= state == S_ADDRESS ? k_address [byte_idx] :
                       state == S_DATA1   ? k_location[byte_idx] :
                                            k_data    [byte_idx];
            end
            1, 2: scl_o <= 1; // Hold SDA constant while SCL is high
            3:    scl_o <= 0;
          endcase else case (step) // =================================
            // READ a data bit from the target
            0: begin
              scl_e <= 1; scl_o <= 0;
              sda_e <= 0;
            end
            1: begin
              scl_o <= 1;
            end
            2: begin
              // Read the SDA now, it should be stable from target.
              // $display("CONTROLLER: read bit %d @ %d", sda_i, $time);
              // And shift it in
              data_accum <= { data_accum[6:0], sda_i };
            end
            3: begin
              scl_o <= 0;
            end
          endcase

        end // ACK or DATA bit?
        
        // NEXT STEP/STATE
    
        // What's our next state after ACK?
        if (byte_step == 4'd8 && step == 2'd3) begin
          // 1. We're at the very end of a byte, so advance state.
          // Remember byte_step 8 is the ACK.
          
          // Check if we got_ack first if we're writing;
          // if we didn't get the ACK we should abort by jumping to stop.
          if (is_write && !next_got_ack) begin
            // We're writing, and we didn't get the ack,
            // so we should abort our transaction and send a STOP.
            // (Section 3.1.6 says we can also START again.)
            abort <= 1'b1;
            state <= S_STOP;
          
          end else if (state == S_DATA2) begin
            if (k_data_repeat == 0 || !is_write) begin
              // No more repeats (we never repeat when we are reading)
              state <= S_STOP;
              // We never missed an ACK, so we're good!
              success <= 1'b1;
            end else begin
              // We have to repeat the DATA2 step
              k_data_repeat <= k_data_repeat - 3'd1;
            end
          end else if (state == S_DATA1 && k_address[0] && !k_read_two) begin
            // We're READING, and only want ONE BYTE, so we're done
            state <= S_STOP;
            success <= 1'b1;
          end else begin
            // We are in ADDRESS or DATA1 (not DATA2):
            // ADDRESS goes to DATA1; DATA1 goes to DATA2
            // (we're writing, or reading two bytes).
            state <= state == S_ADDRESS ? S_DATA1 : S_DATA2;
          end
          byte_step <= 4'd0;
          byte_idx <= 3'd7;
          
        end else if (step == 2'd3) begin
          // 2. In the middle of the byte: advance our byte step/idx
          byte_step <= byte_step + 4'd1;
          byte_idx <= byte_idx - 3'd1; // Ignore underflow
        end
        
        // 3. We always advance our step
        step <= step + 2'd1; // allow overflow from 3 to 0
      end // S_ADDRESS/DATA# case
      
      S_STOP: begin /////////////////////////////////////////////////
        // SDA L->H while SCL H
      
        case (step)
          0: begin
              scl_e <= 1; scl_o <= 0; step <= 2'd1;
              sda_e <= 1; sda_o <= 0; 
            end
          1: begin
              scl_o <= 1;             step <= 2'd2;
            end
          2: begin
              sda_o <= 1;             step <= 2'd3;
            end
          3: begin
              scl_o <= 0;             step <= 2'd0;
              state <= S_IDLE;
              stop_pulse <= 1; // We're done
            end
        endcase // S_STOP step
      
      end // S_STOP case
        
      default: //////////////////////////////////////////////////////
        state <= S_RESET;
        
    endcase // state
  
  end // not reset

end // Always Clock Edge

endmodule // I2C_CONTROLLER
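
// Example hookup (a minimal sketch, not part of this module): one way to
// connect the _i/_o/_e triples to bidirectional pins with a tristate
// driver, as the port comments above ask for. The names scl_pin and
// sda_pin, and the enclosing top level, are hypothetical; adapt as needed.
//
//   inout wire scl_pin, sda_pin;   // top-level bidirectional pins
//   logic scl_i, scl_o, scl_e, sda_i, sda_o, sda_e;
//
//   assign scl_pin = scl_e ? scl_o : 1'bz; // drive only when enabled
//   assign sda_pin = sda_e ? sda_o : 1'bz;
//   assign scl_i   = scl_pin;              // always observe the pin
//   assign sda_i   = sda_pin;
//
// Because `activate` is only sampled once every CLK_DIV system clocks,
// a caller would likely wait for !busy, present address/location/data,
// hold `activate` high until `busy` rises, and then check `success` or
// `abort` once `busy` drops again.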




`ifdef IS_QUARTUS // Defined in Assignments -> Settings -> ... -> Verilog HDL Input
// Restore the default_nettype to prevent side effects
// See: https://front-end-verification.blogspot.com/2010/10/implicit-net-declartions-in-verilog-and.html
// and: https://sutherland-hdl.com/papers/2006-SNUG-Boston_standard_gotchas_presentation.pdf
`default_nettype wire // turn implicit nets on again to avoid side-effects
`endif
